R
25Winter
data: books.csv
Author

Aisha Aslam

Published

July 1, 2025

Introduction

If you love books, you are in just the right place :) Wouldn’t it be interesting to find out:

Is there a correlation between a book's average rating and the number of text reviews it receives, at least among popular books?

Let us find out using the following dataset.

The Dataset

We will be using the publicly accessible ‘Goodreads books’ dataset from Kaggle (https://www.kaggle.com/datasets/jealousleopard/goodreadsbooks).

library("dplyr")
library("ggplot2")
library("plotly")
library("knitr")

A sneak peek into the dataset

goodreadsdata <- read.csv("../../../../data/books.csv")

kable(head(goodreadsdata))
| X | bookID | title | authors | average_rating | isbn | isbn13 | language_code | num_pages | ratings_count | text_reviews_count | publication_date | publisher |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Harry Potter and the Half-Blood Prince (Harry Potter #6) | J.K. Rowling/Mary GrandPré | 4.57 | 0439785960 | 9.780440e+12 | eng | 652 | 2095690 | 27591 | 2006-09-16 | Scholastic Inc. |
| 2 | 2 | Harry Potter and the Order of the Phoenix (Harry Potter #5) | J.K. Rowling/Mary GrandPré | 4.49 | 0439358078 | 9.780439e+12 | eng | 870 | 2153167 | 29221 | 2004-09-01 | Scholastic Inc. |
| 3 | 4 | Harry Potter and the Chamber of Secrets (Harry Potter #2) | J.K. Rowling | 4.42 | 0439554896 | 9.780440e+12 | eng | 352 | 6333 | 244 | 2003-11-01 | Scholastic |
| 4 | 5 | Harry Potter and the Prisoner of Azkaban (Harry Potter #3) | J.K. Rowling/Mary GrandPré | 4.56 | 043965548X | 9.780440e+12 | eng | 435 | 2339585 | 36325 | 2004-05-01 | Scholastic Inc. |
| 5 | 8 | Harry Potter Boxed Set Books 1-5 (Harry Potter #1-5) | J.K. Rowling/Mary GrandPré | 4.78 | 0439682584 | 9.780440e+12 | eng | 2690 | 41428 | 164 | 2004-09-13 | Scholastic |
| 6 | 9 | Unauthorized Harry Potter Book Seven News: Half-Blood Prince Analysis and Speculation | W. Frederick Zimmerman | 3.74 | 0976540606 | 9.780977e+12 | en-US | 152 | 19 | 1 | 2005-04-26 | Nimble Books |
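
Before plotting, it can help to confirm that the two columns we care about were read in as numbers and to see how many values are missing. The following is a minimal sketch; `check_columns` is a hypothetical helper name introduced here for illustration, not something from the dataset or from any package.

```r
# Hypothetical helper (an assumption, not part of the original post):
# report the class and the number of missing values for selected columns.
check_columns <- function(df, cols = c("average_rating", "text_reviews_count")) {
  data.frame(
    column    = cols,
    class     = vapply(cols, function(x) class(df[[x]])[1], character(1)),
    n_missing = vapply(cols, function(x) sum(is.na(df[[x]])), integer(1)),
    row.names = NULL
  )
}

# Assumes goodreadsdata was loaded with read.csv() as shown above
if (exists("goodreadsdata")) {
  print(check_columns(goodreadsdata))
}
```

If either column shows up as `character` rather than `numeric` or `integer`, some rows probably contain non-numeric text and would need cleaning before the plot is meaningful.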

Let’s visualize the book reviews and book ratings

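# View(goodreadsdata)  # uncomment to browse the full table interactively

# Scatter plot: average rating (x-axis) against number of text reviews (y-axis)
ggplot(goodreadsdata, aes(x = average_rating, y = text_reviews_count)) +
  geom_point(colour = "blue")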

Going by the plot, higher ratings = more reviews :)
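
A scatter plot gives a visual impression, but the opening question asks about correlation, which can also be put into a number. Here is a minimal sketch using base R's `cor()`. The helper name `rating_review_cor` is hypothetical and introduced only for illustration. Spearman is included alongside Pearson on the assumption that review counts are heavily skewed, which a rank-based measure handles more gracefully.

```r
# Hypothetical helper (an assumption, not part of the original post):
# Pearson and Spearman correlation between average rating and review count.
rating_review_cor <- function(df) {
  # Coerce to numeric in case read.csv() parsed a column as character
  rating  <- suppressWarnings(as.numeric(df$average_rating))
  reviews <- suppressWarnings(as.numeric(df$text_reviews_count))
  c(
    pearson  = cor(rating, reviews, use = "complete.obs", method = "pearson"),
    spearman = cor(rating, reviews, use = "complete.obs", method = "spearman")
  )
}

# Assumes goodreadsdata was loaded with read.csv() as shown earlier
if (exists("goodreadsdata")) {
  print(rating_review_cor(goodreadsdata))
}
```

Values close to 1 would support the "higher ratings, more reviews" reading, values near 0 would suggest little linear or rank relationship, and negative values would point the other way.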